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(57) Abrege 

L'invention concerne une nouvelle technique d'approche des problemes d'optimisation, en fonction d f un travail ant§rieur 
effectue sur des configurations d'extinction en macroevolution, designees algorithmes macroevolutife (MA). Par opposition a 
revolution du niveau de la population, utilisee dans les algorithmes genetiques (GA) standards, on utilise revolution a. un 
niveau taxinomique plus elev6 sous forme d'une m&aphore sous-jacente. Ladite technique explorte la presence de liaisons entre 
les especes, representant des solution candidates au probleme de ('optimisation. Afin de tester Pefficacite de cette technique, 
on compare la performance des MA par rapport aux GA avec une selection de tournoL Cette technique s'avere constituer une bonne 
alternative aux GA standards, par realisation d'une recherche monotone rapide dans I'espace de solution, m§me pour des 
dimensions de population tres limitees. L'approche thSorique presentee montre que les forces dynamiques de base des MA sont 
proches d'un modele ecologique de competition entre especes multiples. 
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A METHOD AND SYSTEM FOR PERFORMING 
OPTIMIZATION ON FITNESS LANDSCAPES 

10 



1 Field of the Invention 

15 

The present invention relates generally to performing optimization on fitness land- 
scapes using macroevolutionary algorithms. More particularly, the present invention 
comprises species to represent candidate solutions to the optimization problem and 
2Q & plurality of rules and operators that govern the extinction and colonization of 

species. 



2 Background 

Evolutionary computataion[l) comprises techniques that has been inspired in biologi- 
cal mechanisms of evolution to develop strategies for solving optimization problems [6]. 
The search takes place often on a rugged landscape which is a geometric represen- 
tation of the optimality of all possible solutions[5]. Populations of individuals (can- 
didate solution to a given problem) evolve using operators inspired in the theory of 
evolution by natural selection. 

In genetic algorithms (GA)[3, 2, 5], populations are formed by a constant num- 
ber of individuals N described by 5-bit sequences. From a given population at a 
given generation, the next generation is obtained by applying the selection operator 
followed by the crossover operator and the mutation operator. In standard GA, the 
selection operator chooses individuals with a probability proportional to its fitness, 
but this can sometimes lead to premature convergence. In this sense, the selection 
operator strongly constrains the convergence velocity of evolutionary algorithms. To 
avoid these problems, other operators such as rank selection or tournament selection 
can be used [5j. Next, pairs of individuals are randomly combined with a crossover 
probability p c by randomly choosing the crossover point. Finally, each bit of each 
individual is mutated with a mutation probability Pm[5]. 

As a result of their simplicity and flexibility, GA's can be applied to a large set 
of combinational optimization problems with a large space of candidate solutions, 
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although the intrinsic metastability that characterizes their dynamics has been only 
recently understood in theoretical terms. In optimization problems with several lo- 
cal extrema, the GA*s are widely used because of its general purpose to find a 
good approximation^]. However, instead of the Darwinian, short-term evolution- 
ary metaphor, a different time scale can be considered, the macroevolutionary one, 
where extinctions and diversification of species through internal interactions are at 
work. Large extinctions can generate coherent population responses very different 
from the slow, Darwinian dynamics of classical G A. Besides, the population of can- 
didate solutions/species might be understood in terms of an ecological system with 
connections among different species, instead of just a number of independent entities 
with a given assigned fitness value. 

In this paper, a simple model of macroevolution will be used as a starting point 
to solve global-extrcmization problems. Our method will be called a macroevolu- 
tionary algorithm (MA) and it will be compared with GA's through some statistical 
measures obtained from numerical experiments with seven standard multivalued 
functions. A theoretical approach to a specific (though relevant) case will be con- 
sidered in order to understand the dynamics of this system. It will be shown that 
the basic mechanism involves competition among fitness peaks with an eventual 
symmetry-breaking process if two peaks are identical or very close in fitness. 



3 Brief Description of Drawings 

FIG. 1 shows a vector field in the (N k t W r )-space for the mean-field theory of the two- 
peak MA problem, as defined by equations (19-20). Here we use 17 = 0.5, £ = 0.01, 
P=l,£) = l ) r = 0.01. The system flows towards a single attractor (F, 0) which is 
globally stable. 

FIG. 2 shows a shows a vector field in the (N k , N r ) -space for the mean-field 
theory of the two-peak MA problem for the marginal problem r\ ~ 0 (two identical 
peaks). Here the system is described by a couple of linear equations (21-22) which 
lead to an infinite number of stable solutions, as given by the line N r = 1 - N k . 

FIG. 3 shows a landscape of two 2-input functions. FIG 4. shows four stages in 
the evolution of population applying MA to example 1 (P = 50> G = 100.) Each o 
indicates an individual represented in input space, A are local maxima. 

FIG. 5 shows four stages in the evolution of population applying MA to example 2 
(P = 50,G = 100). Each o indicates an individual represented in input space, A. 

FIG. 6 shows a parameter behavior for the MA in example 1 (6A, 6B, 6C) and 
example 2 (6D, 6E, 6F). (G = 400, P = 50, R = 50.) 
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FIG. 7 shows a mean fitness value about all population in each generation for a 
run (G = 100, P = 100 for GA and P = 50 for MA.) 

FIG. S shows the best individual fitness in each generation for three runs using 
example 1 (8A, 8B) and example 2 (8C,8D). (G = 100, P = 100 for GA and P = 50 
for MA.) 

FIG. 9 shows different measures according to population size for both GA and 
MA for example 1 (9A, 9C) and example 2. (10B) (Ffrom 50 to 400 step 10, G - 400, 
R = 50.) 

FIG. 10 shows a relation between fitness value reached, probability of success to 
reach a good fitness, and time required, applying GA and MA to examples 1 (10A) 
and 2 (10B). 

FIGS, lla-lld shows a relation between fitness value reached and time needed, 
applying GA and MA to examples 2 to 6. 

FIGS. 12a-b shows a relation between fitness value reached and time needed, 
applying GA and MA to examples 7 and 8. 

FIG. 13 discloses a representative computer system 1310 in conjunction with 
which the embodiments of the present invention may be implemented. 

Detailed Description of the Preferred Embodiment 
4 Model of Macroevolution 

The biological model of macroevolution (MM) [8] allows to stimulate the dynamics 
of species extinction and diversification for large time scales[8, 10]. The model used 
in our study is a network ecosystem where the dynamics is based only on the relation 
between species. The links between units/species are essential to determine the new 
state (alive or extinct) of each species at each generation. We can define the state 
of species i in the generation t as 

q (*\ _ / 1 if state IS "alive" 

* m ~ \ 0 if state is "extinct" 1 ' 

Moreover, the model considers that time is discretized in "generations" and that 
each generation is constituted by a set of P species where P is constant. The re- 
lationship between species is represented by a connectivity matrix W, where each 
item Wjj(i) € {1,...,P}) measures the influence of species j on species t at 
t with a continuous value within the interval [— 1,+1] (in ecology, this influence is 
interpreted as the trophic relation between species). At the end of each generation, 
all extinct species will be replaced by the species. Briefly, each generation in the 
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biological model consists on a set of steps (the rules) which will be translated to the 
MA model (see below): 

1. Random variation: For each species i, a connection Wij(t) is randomly chosen, 
and a new random value between -1 and 1 is assigned. 

2. Extinction: The relation of each species with the rest of the population deter- 
mines its survival coefficient h defined as 

Mt) = £)Wu(t) (2) 

where t is the generation number. The species state in the next generation is 
synchronously updated: 

° tKl + L) ~ \ 0 (extinct) otherwise w 
This step allows for the selection and extinction of species. 

: 3. Diversification: We colonize vacant sites freed by extinct species with surviving 
' species. Specifically, a colonizer c will be randomly chosen from the set of sur- 
vivors. For all vacant sites (i.e. those such that Sjt(t) = 0) the new connections 
will be updated in this way: 

Wk = Wk + ifc* (4) 
where rj is a small random variation and Sk(t + 1) = 1. 

This model was shown to reproduce most of the statistical features of macroevolution 
[8, 10J and it provided a natural source for the decoupling between microevolution 
(i.e. those processes modeled by genetic algorithms involving selection and contin- 
uous adaptation) and macroevolution (i.e. those processes involving extinction and 
diversification) [9]. Because of the essentially different nature of this approximation, 
we could ask which kind of outcome would be expected from the application of this 
model to optimization problems. As we will see, the macroevolution model allows to 
construct a simple but powerful method of optimization which is able to outperform 
GAs in several relevant examples. 
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5 Macroevolutionary Algorithm 

In this section we show how to map the previous model of extinction into a model of 
optimization. The new model will be called a macroevolutionary algorithm (MA). Let 
us define the d-dimensional fitness function / as a multidimensional function that 
we want to maximize. Our objective is to find the best values for the rf-dimensional 
vectors of our problem under consideration. Thus our individuals/species are now 
Si = p € Q 6 Sft**, i.e. d-dimensional objects constrained to a subspace ft. In this 
context p will be a good approximation if Vq : /(q) < /(p) + e where p and q are 
individuals and e > 0 is a given error threshold. 

In this way each individual in MA is described by a d-input vector with fitness 
/. The domains for these inputs describe the search space where our fitness function 
is nothing but a (more or less) rugged landscape. 

As with GA, MA uses a constant population size of P individuals evolving in 
time by successive updates of the given operators. The main idea is that our system 
will choose, through network interactions, which are the individuals to be eliminated 
so as to guarantee exploration by new individuals and exploitation of better solu- 
tions by further generations. To this purpose, it is essential to correctly establish a 
relationship between individuals. This is described by the following criteria: 

(a) Each individual gathers information about the rest of the population through 
the strength and sign of its couplings VJ^y. Individuals with higher inputs ki will be 
favoured. Additionally, they must be able to outcompete other less-fit solutions. 

(b) Some information concerning how close two solutions are in U is required (al- 
though this particular aspect is not strictly necessary). Close neighbors will typically 
share similar /-values and will cooperate. In this context, we can define the connec- 
tion Wij as: 



where p» = (p\ , ...,P?) are the input parameters of the t-th individual. The numer- 
ator is a consequence of our first criterion, while the second criterion is introduced in 
the denominator as a normalization factor that weights the relative distance among 
solutions. 

Now we can define the most important ingredients that will be used in building 
the set of operators to be applied each generation: 

1. Selection operator: It allows to calculate the surviving individuals through 
their relations, i.e. as a sum of penalties and benefits. The state of a given 
individual Si will be given by: 




(5) 



5 



SUBSTITUTE SHEET (RULE 26) 



WO 00/38079 



PCT/US99/30641 



Si(t+l) = 



1 ifX&lWy(t)>0 



(6) 



0 otherwise 



where t is generation number and Wij — W(p : -, pj) is calculated according to 
equation (5). In the following this rule will be indicated as S»(i + 1) = 0(ki(t)) 
where 0(z) — 1 if z > 0 and zero otherwise. Additionally, if the distance 
between two solutions is zero then we set Wij(Dij = 0). 

2. Colonization operator: It allows to fill vacant sites freed by extinct individuals 
(that is, those such that $ — 0). This operator is applied to each extinct 
individual in two ways. With a probability r, a totally new solution p n G ft will 
be generated. Otherwise exploitation of surviving solutions takes place through 
colonization. For a given extinct solution p if we choose one of the surviving 
solutions, say p b . Now the extinct solution will be "attracted" towards p&. 

Mathematically, a possible (but not unique) choice for this colonization of 
extinct solutions reads: 



where f € [0, 1] is a random number, A € [— (both with uniform dis- 
tribution) and p and r are given constants of our algorithm. So we can see 
that p describes a maximum radius around surviving solutions and r acts as 
a temperature. 

Although all essential rules defining the MA have been presented, several im- 
provements and additional rules have been explored. In particular, we can decrease 
r with time as in simulated anncaling[4 I 7, 11] to get a good convergence. In this 
context, the "temperature" r, when lowered, provides a decrease in randomness 
which favours the exploitation around the best individual found. In order to lower 
r in each generation, we can use a given decreasing function. Here we have used a 
linear relation: 



where G is number of generations. The results of using this linear annealing 
procedure do not strongly differ from other choices of r(t). 




(7) 



r(t;<7) = l-i 



(8) 
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6 Mean-field theory 

In this section we will introduce a qualitative approximation to the basic MA be- 
havior, which is based on a competition process among the largest peaks, eventually 
followed by a symmetry- breaking phenomenon when the peaks are very similar. 

To make our argument clear, let us assume the simplest case with two peaks 
with fitness located ^at two given points p* and p r . For simplicity we will 

assume that 0 < /£, /* < 1. A given number of solutions will be clustered around 
these peaks. Let us call T* and T T the sets of solutions around each peak, i.e. we will 
assume that if p 0 ,P6 S then /(p 0 ) « /(p&) & Let D r t the distance between 
peaks, and let us assume that /£ — /* = 17 > 0. Let N k (t) and /^(t) the number of 
solutions at each peak at a given step t. 

The two operators act on each population around each peak. We can describe 
the time evolution of our system in terms of two steps: 

N k {t + l) = V PtT [£ WiS {N k (t)j\ (9) 

where £ and V indicate extinction and diversification operators, respectively. Let us 
first introduce the extinction step. For the r* population we have: 

N k (t + 1/2) = £ WiJ {N k (t)) = £ Pi )) (10) 

where we use the notation <7 M (pi, p,*)) s Wij- So the final expression for the r* 
population is: 



tj€V k °\P) j€T r 

And an equivalent expression can be derived for the second peak: 



iV*(t+l/2) = 



Wit + 1/2) = £ 0\Z t£t + £ G»(Pi,Pj) 
i€r r Ljerv °\P) ier* 



(11) 



(12) 



where both populations have been explicitly separated. In our approximation, we 
assume that the Tk set is formed by a closely related set of solutions, which are 
close inside a domain of size M(r*) and radius 5(p). This radius clearly depends 
in some simple way on rule (2). Here we just assume that 6 is small (after the 
clustering process). There are two terms in the previous equation which involve 
interactions among solutions in the same cluster and interactions among solutions 
of both clusters. We can assume that the first term is small compared with the cross- 
interaction term. This approximation will of course depend of the specific properties 
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of the landscape but is reasonable in terms of our previous assumptions. Here we 
also assume that the number of random solutions generated by rule (2) (which will 
depend on r) is small and so we can approximate N k + N* « P. 

A mean-field set of equations (i.e. a Erst approximation ignoring correlations) 
can be derived from the previous approximation for the extinction operator. The 
basic equation for N k is: 



N k (t + 1/2) = JV*(l - P[6{z) < 0]) 



(13) 

and an equivalent expression for N r (t + 1/2). Now we can use the following (mean 
field) approximation: 



(*(G,foi.I*))) r «^*0l) 



(14) 



where z = k, r. The function g x (ii) involves the specific response of each population 
to the difference in fitness between peaks and D r ,k stands for the distance among 
maxima. 

The following pair of nonlinear discrete maps is obtained: 



JV*(t+ 1/2) = N k {t) 



N T (t + 1/2) = N T (t) 



l-Sitfo) 



J\T(t) 



1-ffrfo) 



PD Tk 

N k {t) 
PD rk 



(15) 



(16) 



Let us note that, strictly speaking, an additional equation for the N n (t) -population 
(the one formed by randomly generated solutions) should also be included. Here 
we assume that it is very small and typically formed by low-fit individuals. The 
functions g z (v) must fulfil some basic requirements. Several choices are possible 
(and all of them lead to the same basic results). The simplest is: gk(w) = v(fk ~ V) 
and g r (rj) — t?/J. The first pt-function tells us that the IVpopulation will have a 
negative effect on I\ proportionally to its fitness but also to the difference between 
peaks. As this difference increases, it is less likely for the smaller peak to have some 
harmful effect on IV For simplicity, since all the dynamics depends on the difference 
between peaks, we will use = 1. 

The next step in our analysis involves diversification. Following our previous 
approximations, it is not difficult to show that the equations for this second step 
read: 

N k (t + 1) = N k {t + 1/2) + $*(P -N k {t + 1/2) - N r {i + 1/2)) (17) 
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JTit + 1) = AT [t + 1/2) + tf r (P -AT*(t+ 1/2) - N*{t + 1/2)) (18) 

where the coefficients * x involve both the colonization of H by the survivors towards 
the domain, as well as the generation of new, random solutions. It is not difficult 
(although tedious) to show that these coefficients are: = rj(l — r) + &r ^ r = 
x){\ - 7})(\ - r) +&-T. The second term in this quantities indicate the likelihood that 
a random solution falls into one of the V sets. Explicitly, = n(r t )/p(Q) t where 
fi(z) is the measure (i.e. hypervolume) of each set. 

After some algebra, using equations (15-18) and the continuous approximation 
N z (i + 1)-N*{t) « dN k /dt, the full equations for the MA mean field dynamics are 
obtained: 

dN k / v 

Sg- = - N*) + N k (v k {p(2 - n)7\T - 1) - 0(1 - i,)JV) (19) 

^ = * r (P - **) + N T (%{0{2 - r?)iV* - 1) -0(1 - T7)N*) (20) 
where 0 = 7?/PD r *. 

This dynamical system can be studied in terms of standard stability analysis [1], 
It can be easily shown that the only fixed points here are (P,0) and (0, P). It is 
worth mentioning that our model involves a competition process which is different 
from a standard Lotka-Volterra (LV) model of two-species competition [6]. The LV 
model is defined by two nonlinear differential equations dx/dt = p\y (P— i— fay) and 
dy/dt — ^(P-y-fox), where x, y are the population sizes, /i is the growth rate of 
each one and ft > 0 the competitionxoefficients. The LV model also involves the two 
so-called exclusion points (P,0) and (0, P) but also two additional ones, (0,0) and 
(x* > 0, y* > 0). The last one implies that coexistence among competitors can take 
place for some parameter combinations. In the model (19-20) no such coexistence 
can take place. This is important, as far as for optimization purposes no coexistence 
among different peaks is desired. 

The Jacobi matrix Jj,- = {dWjdN*) for our model can be computed for the 
two fixed points and the eigenvalues are such that, for rj > 0, A±(P,0) < 0 and 
A±(0,P) > 0, so a stable and an unstable node are associated to (P,0) and (0,P) 
respectively. In fact it can be shown that (P, 0) is globally stable, attracting all initial 
conditions towards T r (figure 1). 

A particular case is given by r/=0, where the same fitness is obtained for each 
peak. In terms of optimization, this is not a major problem, since we want to find 
the best solution. The vector field for this case is also shown in Figure 2, where we 
can see that the trajectories converge to a full line given by N* = P — N k . This is 
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what we could expect if we introduce 17 = 0 in the previous equations. This leads to 
a linear system of equations 

^ = &t(P-JV-W*) (21) 

*2L = S T r(P-N'-N k ) (22) 

at 

which shows that, for the two-cluster problem with N n ~ 0, the dynamics settles 
in a final state where a fraction of the population clusters around one peak and the 
other around the second. Numerical evidence shows, however, that this is not the 
case: symmetry-breaking (i.e. the choice among one of the two equal peaks) typically 
occurs: small fluctuations and the presence of random solutions eventually shifts the 
system towards one of the two peaks. 



7 Numerical Results 

A number of numerical experiments have been performed in order to study the 
behavior of MAs and their superiority in relation with standard genetic algorithms. 
Several functions have been used to this purpose, and are listed below 1 . 

(1) Two 2-input (i.e. two-dimensional) functions (see Figure 1) of the form 

The two specific examples are: 
(a) A landscape S\ with some local maxima of close heights where m = 10 and 
x e [0, 100] 2 (see Fig. 3) The location of the maxima is given in the following table: 



Pi 


(35,85) 


(75,75) 


(25,30) 


(45,45) 


(80,55) 




(65,55) 


(25,65) 


(85,15) 


(90,90) 


(70,10) 


Oi 


40 


55 


75 


99 


85 




95 


85 


65 


92 


35 




35 


30 


45 


55 


60 




20 


70 


40 


40 


55 



(b) A landscape fa with infinite maxima around the global maximum where m — 2 
andf € [0.100] 2 (figure 3.) 

a For all the numerical experiments, a Sun Ultra- 1 workstation with SunOS operating system 
has been used for all the numerical experiments. 
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10 



15 



20 



25 



30 



35 



40 



45 



Pi 


(50,50) 


(50,50) 


(50,50) 


« 


750 


-720 


35 


<* 


500 


425 


25 



(2) In order to obtain a standard comparison with previous studies, we have 
also used several multivalued functions proposed in the contest held during the 1996 
IEEE International Conference on Evolutionary Computation for real valued spaces 
containing five TV-input functions: 
(c) The sphere model where x { 6 [-5, 5] with N = 10: 



/«(*) = -£(*-!)' 

(d) Griewank's function where i,- e [-600, 600] with N = 10: 

m = ~ -m |> - 100 > 2 + n «- ) + 1 

(e) Shekel's foxholes where x< € [0, 10) with N = 10: 

AGO -El 



(24) 



(25) 



(26) 



(27) 



(28) 



(f) Langerman's function where Xj € [0, 10] with N = 10: 

/•(*) = Ec j eil |f -^ i »U ? cos( J r||x-/l(t)|| 2 ) 

(g) Michalewicz's function where X{ g [0, n] with N = 10: 

/7(^) = Esin(x i )sin 20 (^) 

(h) Rotated Michalewicz's function version where x» € [0, tt] with N = 10 and a = |: 

/8(f) = / 7 (rtaatam(f)) 
where Rotation means to perform N — 1 rotations of a radians centered at point 

V 2 » * * * * 2 f" 

As a first approximation to numerically explore MA's behavior, we have repre- 
sented (as points on the Q domain) scores of local maxima and the population distri- 
bution at four different time steps for the two-input functions f\ and it (see FIGS. 
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2 and 3). The role of randomness is shown graphically with this 2D-representation: 
Initially, MA explores the input space looking for good individuals, which are clus- 
tered around local maxima (in the figures, stage with t — 10). Afterward (see the 
plot at t = 30 and t = 45) competition between maxima takes place by attracting 
more individuals/solutions. Finally, at stage t — 70, it is expected that the global 
maximum wins because it captures all individuals. 

In order to understand how the two parameters p and r influence MA behavior, 
three types of experiments were performed for fi and fa using equal population sizes 
for both functions (P = 50) over G = 400 generations and for R = 50 independent 
runs. Explicitly, we considered the following situations: (a) p is varied using values in 
[0, 1], while t is fixed; (b) p is varied using values in [0, 1], and r is varied using linear 
simulated annealing according to the equation (8) and (c) r is varied using values 
[0, 1], while p is fixed. In order to quantify the differences, the following measures 
were used: (1) Mean of best fitness value reached; (2) Mean number of generations . 
needed in order to reach a good fitness value (98.0199 for fi , and 64.3500 for / 2 ) 
and (3) Probability of success in reaching a good fitness value /*, measured as the 
relative number of runs that have readied 

The basic results of these experiments are summarized in Figure 4. We can see 
that there is no well-defined optimum for neither r nor p. Typically a wide range of 
parameter values gives similar results. These results allow us to reduce the number 
of parameters which should be tuned in our algorithm to just two: population size 
P and number of generations G to be calculated. 

Actually, in most cases simulated annealing seems to be the best choice, being the 
specific /rvalue less important. For this reason, MA with linear simulated annealing 
will be used in our study. In all our simulations we will use p = 0.5. In FIG. 5, 
we compare MA with and without simulated annealing by measuring the mean 
fitness for all solutions in the population in each generation: for MA with simulated 
annealing (i.e. with reduced exploration in favour of exploitation) the mean tends 
monotonically to the optimum, while for r constant, the population shows lower 
fitness values, close to GA results. 

To further simplify the comparison between MA and GA, standard values of p c = 
0.7 and p m = 0.001 for the genetic algorithm will be used (other reasonable choices 
gave equivalent statistical results). In the GA model, individuals are represented by 
a 32-bit sequence. Moreover, due to the fact that GA shows premature convergence 
for some functions, another selection operator, tournament selection [5] (choosing 
the best of two random individuals with a probability of, say, 0.75) will be used. 

A first comparison between GA and MA — with and without simulated annealing — 
is shown in FIG. 6 where single runs using the three methods are plotted. Here FIGS. 
6A, 6B, 6C and FIGS. 6D, 6E, 6F correspond to runs of the first and second examples 
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of our list, respectively. These pictures shows that, while MA converges progressively 
with time, the GA tends to show considerable fluctuations. 

In order to mesure and compare their performance, several extensive simulation 
experiments were used (see Figure 9) for /i and / 2 using different population sizes 
(P € [50, 400]) over G = 400 generations and using R - 50 independent runs. The 
following measures were used: 

(a) Mean of best fitness value reached after R runs, (b) Mean number of generations 
required to reach a good fitness value (/* = 98.0199 for f u and /• = 64.3500 for / 2 ). 
(c) Probability of reaching a good fitness value, (d) The time used (in clock ticks) 
to reach a good fitness value. 

Many different simulation experiments were performed in order to compare the 
advantages of MA in relation with GA with tournament selection. The main results 
and conclusions from our simulations are: 

1. Macroevolutionary algorithms reach higher fitness values than GAs for equal 
population sizes. The difference between both approaches decreases if P is 
large enough (simply due to fact that MA have already reached the optimum 
while GA continues exploring). 

2. The probability of success in reaching a good fitness value along G generations 
in a typical run is higher in MA than in GA. 

3. The time needed to reach the optimum using the same population size is lower 
in MA if the population size is small. But for large P (see Figure 9), the time 
required in each iteration by MA can become larger than for GA (order P 2 vs 

n 

The asymptotic time difference is due to the selection operator, which is compu- 
tationally more .expensive. In standard GA, this operator chooses individuals with 
a probability proportional to their fitness in a time scale of order 0(P log P), but 
if tournament selection is used, it is only 0(F). In contrast, MA involves a time 
scale of order 0(P 2 ). For this reason, a comparison of these measures for both al- 
gorithms with equal populations P is not adequate. A alternative could be to apply 
a readjustment factor to P in terms of time cost. However the relevance of small 
values for P in the performance of MA makes the asymptotic comparison useless. 
However, if the distance between candidate solutions is removed from the definition 
of the couplings (5) a considerable increase in MA performance is obtained, although 
an extensive re-analysis of our results will be required in order to compare the two 
cases. 

A different approximation can be used. We could be interested in faster but less 
accurate solutions, or in larger fitnesses but slower convergence. A good comparative 
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criterion should allow us to explore this compromise between fitness values and 
the time needed to reach them. To this objective, several experiments have been 
performed — varying P and G for different sample sizes (R) — in which both the 
fitness of the best solution reached and time required are measured. These measures 
have been sorted and averaged over time, and the success probability calculated. 
The basic results of these simulations are summarized in the next table and in in 
figures 8, 9 and 10 



Funct. 


Alg. 


P 


AP 


G 


AG 


R 


time interval 


h 


GA 


50 to 400 


20 


50 to 400 


10 


15 


25 


h 


MA 


50 to 250 


10 


50 to 400 


10 


15 


25 


/2 


GA 


50 to 400 


20 


50 to 400 


10 


15 


25 




MA 


50 to 250 


10 


50 to 400 


10 


15 


25 


h 


GA 


220 to 480 


20 


500 to 5000 


300 


5 


600 


h 


MA 


20 to 150 


20 


500 to 1500 


300 


5 


125 


h 


GA 


120 to 270 


20 


300 to 5000 


300 


5 


600 


h 


MA 


20 to 240 


20 


300 to 5000 


100 


5 


400 


h 


GA 


150 to 380 


20 


500 to 6000 


300 


5 


600 


h 


MA 


50 to 240 


20 


800 to 6000 


300 


5 


400 


h 


GA 


150 to 380 


20 


500 to 6000 


300 


5 


2500 


h 


MA 


50 to 240 


20 


800 to 6000 


300 


5 


2000 


h 


GA 


150 to 380 


20 


500 to 6000 


300 


5 


1200 


h 


MA 


50 to 210 


20 


800 to 6000 


300 


5 


1500 


h 


GA 


150 to 380 


20 


500 to 6000 


300 


5 


2000 


h 


MA 


50 to 210 


20 


800 to 6000 


300 


5 


2000 



These experiments shows that, for all the used test functions used —except 
Michalewicz's function / 7 — the MA reaches better fitness values and has a higher 
success probability the GA. Michalewicz's function ft is a particular case where the 
geometric properties of the landscape favours GA. (However, a simple geometric 
transformation, like rotations of the function that keep the landscape shape, can 
strongly modify the GA performance). 

8 Conclusions 

In this paper we have introduced a novel optimization technique which we have 
called macroevolutionary algorithm. It is based in a simple procedure inspired in 
macroevolutionary responses of complex ecosystems to extinction events. In the 
original model, extinctions removed some species and new ones were generated by 

14 



SUBSTITUTE SHEET (RULE 26) 



WO 00/38079 



PCT/US99/30641 



diversification of the survivors. In the MA approach, the basic ecology-like structure 
is also preserved, but it is now applied to a set of candidate solutions to a given 
optimization problem on a fitness landscape. The survival of species/solutions is 
linked with the relative fitness as given by /(x) in relation with all the other species. 
These differences define the strength of their couplings, weighted by their mutual 
distance in search space ft. If the total sum of input connections to a given species 
is positive, it survives. If negative, it disappears from the system. In this sense, the 
number of removed solutions is not fixed (as it happen to be the case for standard 
GA) but strongly dynamical. Sometimes, a large (mass-) extinction event takes place 
when a very good solution is found. The replacement process guarantees both the 
exploitation of the high-fit solutions as well as further, random exploration of other 
domains of the landscape. Because of the connection matrix, the whole population 
is able to obtain a rather accurate map of the relative importance of the solutions 
being explored in the landscape. 

There are two types of situations to be considered according to whether or not 
simulated annealing is used. The first case is appropiate when we want to find a 
good solution in a given number of generations and also guarantee that such a 
solution is a local optimum within a given confidence. In this case, we should use 
the linear annealing to increase the exploitation over the last generations. In this 
way we guarantee that mass extinction will occur when all individuals have reached 
local optimum. In the second case, when a constant value for r is used, we keep 
the same exploration rate along time. This could be interesting when fitness is a 
time-varying function. 

Moreover, other changes have been considered in the algorithm with no quali- 
tative differences: (a) The colonizer b can be chosen by the colonization operator 
with proportional fitness probability, instead of using just the best individual. Fur- 
thermore, we can add the possibility for each extinct individual to choose its own 
colonizer; (b) To use a different probability distribution (instead of the uniform one) 
to generate A in (7). 

Numerical simulations have shown that MA typically outperformed GA over a 
wide range of conditions. Together with a typically monotonous convergence, the 
interactions between individuals in MA (through the connection matrix WiJ) make 
exploration in input space much more satisfactory. Moreover, MA is being succesfully 
applied to optimization problems that can be formulated in terms of an optimization 
function even if it is highly multi-modal or highly multi-dimensional For example, 
MA can be successfully applied to learning in neural networks as a training algorithm 
and as a multidimensional scaling method. Following this aproach, MA could be 
applied to combinational optimization problems, although the main difficulty is to 
express the problem in terms of relations between individuals. Further work should 
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explore the efficiency of the MA in correlated/uncorrelated landscapes. The mean- 
field approximation to the simple two-peak problem shows that the best solution is 
always a global attractor of the dynamics. Due to (i) the antisymmetric nature of the 
connectivity matrix, (ii) the numerical observation that a peak-to-peak competition 
process takes place through the evolution and (iii) the monotonous change implicit 
in the competition dynamics, we conjecture that a general analytic treatment of 
MAs can be derived. 

FIG. 13 discloses a representative computer system 1310 in conjunction with 
which the embodiments of the present invention may be implemented. Computer 
system 1310 may be a personal computer, workstation, or a larger system such as a 
minicomputer. However, one skilled in the art of computer systems will understand 
that the present invention is not limited to a particular class or model of computer. 

As shown in FIG. 13, representative computer system 1310 includes a central 
processing unit (CPU) 1312, a memory unit 1314, one or more storage devices 1316, 
an input device 1318, an output device 1320, and communication interface 1322. A 
system bus 1324 is provided for communications between these elements. Computer 
system 1310 may additionally function through use of an operating system such as 
Windows, DOS, or UNIX. However, one skilled in the art of computer systems will 
understand that the present invention is not limited to a particular configuration or 
operating system. 

Storage devices 1316 may illustratively include one or more floppy or hard disk 
drives, CD-ROMs, DVDs, or tapes. Input device 1318 comprises a keyboard, mouse, 
microphone, or other similar device. Output device 1320 is a computer monitor or 
any other known computer output device. Communication interface 1322 may be a 
modem, a network interface, or other connection to external electronic devices, such 
as a serial or parallel port 

While the above invention has been described with reference to certain preferred 
embodiments, the scope of the present invention is not limited to these embodiments. 
One skill in the art may find variations of these preferred embodiments which, 
nevertheless, fall within the spirit of the present invention, whose scope is defined 
by the claims set forth below. 
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Claims 

1. A system for performing optimization on a fitness 
landscape comprising: 

an evolutionary model comprising: 

a plurality of species representing a 
corresponding plurality of candidate solutions to an optimization problem; and 

a plurality of rules governing a behavior of said species, said 
rules comprising: 

at least one connectivity rule 
establishing a plurality of relations among said species; 

at least one extinction rule defining an 
an extinction of said species; and 

at least one colonization rule defining a 
filling of vacancies caused by said extinction; and 

a simulator for executing said model to generate 
one or more optimal solutions to the optimization problem. 

2. A system as in claim 1 wherein said at least 

one connectivity rule randomly chooses connections among said 
species. 

3. A system as in claim 1 wherein said at least 

one connectivity rule assigns a connectivity value randomly 
chosen from -1 to 1 to said connections among said species. 

4. A system as in claim 3 wherein said extinction rule defines an extinction 
of each of said species in accordance with said connectivity value. 

5. A system as in claim 1 wherein said at least one colonization rule defines 
said filling of said vacancies by either a new one of said species or an existing one 
of said species. 

6. A system as in claim 5 wherein said at least one colonization rule defines 
an attraction between a surviving one of said species and an extinct one of said 
species, said attraction is proportional to a distance between said surviving species 
and said extinct species. 

7. A system as in claim 1 wherein said at least one colonization rule comprises 
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at least one parameter that influences the filling of vacancies. 

8. Computer executable software code stored on a computer readable medium, 
the code for performing optimization on a fitness landscape, the code comprising: 

code to model an optimization problem comprising: 
code to represent a plurality of candidate 
solutions to the optimization problem with a plurality of 
species; and 

code to govern a behavior of said species, 
said code to govern comprising: 

code to establish a plurality of 
relations among said species; 

code to define an extinction of said species; and 

code to define a filling of vacancies 
caused by said extinction; and 

code to execute said model to generate one or more 
optimal solutions to the optimization problem. 

9. A programmed computer system for performing optimization as a fitness 
landscape comprising at least one memory having at least one region storing com- 
puter executable program code and at least one processor for executing the program 
code stored in said memory, wherein the program code comprises: 

code to model an optimization problem comprising: 
code to represent a plurality of candidate 
solutions to the optimization problem with a plurality of species; and 

code to govern a behavior of said species, said code to govern com- 
prising: 

code to establish a plurality of 
relations among said species; 

code to define a survival of said 

species; and 

code to define a filling of vacancies 
caused by said extinction; and 

code to execute said model to generate one or more optimal solutions to the 
optimization problem. 
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